AITopics | variational information maximizing exploration

VIME: Variational Information Maximizing Exploration

Neural Information Processing Systems, Nov-21-2025, 15:13:05 GMT

Scalable and effective exploration remains a key challenge in reinforcement learning (RL). While there are methods with optimality guarantees in the setting of discrete state and action spaces, these methods cannot be applied in high-dimensional deep RL scenarios. As such, most contemporary RL relies on simple heuristics such as epsilon-greedy exploration or adding Gaussian noise to the controls. This paper introduces Variational Information Maximizing Exploration (VIME), an exploration strategy based on maximization of information gain about the agent's belief of environment dynamics. We propose a practical implementation, using variational inference in Bayesian neural networks which efficiently handles continuous state and action spaces. VIME modifies the MDP reward function, and can be applied with several different underlying RL algorithms. We demonstrate that VIME achieves significantly better performance compared to heuristic exploration methods across a variety of continuous control tasks and algorithms, including tasks with very sparse rewards.
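The reward modification described in the abstract can be illustrated with a minimal sketch. It assumes, beyond what the abstract itself states, that the variational posterior over the dynamics-model weights is a fully factorized Gaussian, and that information gain is measured as the KL divergence between the posterior after and before a transition is observed. The function names, the example numbers, and the weighting coefficient `eta` are hypothetical and chosen only for illustration.

```python
import math


def kl_diag_gaussians(mu_new, sigma_new, mu_old, sigma_old):
    """KL[N(mu_new, sigma_new^2) || N(mu_old, sigma_old^2)], summed over dimensions.

    Both distributions are assumed to be fully factorized (diagonal) Gaussians.
    """
    total = 0.0
    for m1, s1, m0, s0 in zip(mu_new, sigma_new, mu_old, sigma_old):
        total += (
            math.log(s0 / s1)
            + (s1 ** 2 + (m1 - m0) ** 2) / (2.0 * s0 ** 2)
            - 0.5
        )
    return total


def augmented_reward(extrinsic_reward, info_gain, eta=0.1):
    """Add an information-gain bonus to the environment reward.

    eta is a hypothetical trade-off coefficient between exploitation and exploration.
    """
    return extrinsic_reward + eta * info_gain


# Hypothetical posterior parameters before and after observing one transition.
mu_old, sigma_old = [0.0, 0.0], [1.0, 1.0]
mu_new, sigma_new = [0.5, 0.0], [1.0, 1.0]

info_gain = kl_diag_gaussians(mu_new, sigma_new, mu_old, sigma_old)
reward = augmented_reward(0.0, info_gain, eta=0.1)
print(info_gain, reward)
```

Because the bonus enters only through the reward, a sketch like this leaves the underlying RL algorithm untouched, which matches the abstract's claim that the approach can be combined with several different algorithms. A transition that does not change the posterior yields zero bonus.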

name change, variational information maximizing exploration, vime, (4 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.61)

VIME: Variational Information Maximizing Exploration

Rein Houthooft, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel

Neural Information Processing Systems, Nov-21-2025, 08:52:05 GMT

Neural Information Processing Systems http://nips.cc/

artificial intelligence, exploration, machine learning, (14 more...)

Country:

Europe > Belgium > Flanders (0.04)
Asia > Middle East > Jordan (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Reviews: VIME: Variational Information Maximizing Exploration

Neural Information Processing Systems, Jan-20-2025, 17:42:50 GMT

The paper shows a pleasant breadth of understanding of the literature. It provides a number of insights into curiosity for RL with neural networks. I think it could be improved by focusing on the development of the variational approach and the immediately resulting algorithm. As is, there are a number of asides that detract from the main contribution. My main concern is that the proposed algorithm seems relatively brittle.

intrinsic reward, posterior, variational information maximizing exploration, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.56)

VIME: Variational Information Maximizing Exploration, Yan Duan

Neural Information Processing Systems, Mar-12-2024, 16:00:09 GMT

VIME: Variational Information Maximizing Exploration

Houthooft, Rein, Chen, Xi, Duan, Yan, Schulman, John, De Turck, Filip, Abbeel, Pieter

Neural Information Processing Systems, Feb-14-2020, 07:43:51 GMT

state and action space, variational information maximizing exploration, vime, (1 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.45)

VIME: Variational Information Maximizing Exploration

Houthooft, Rein, Chen, Xi, Duan, Yan, Schulman, John, De Turck, Filip, Abbeel, Pieter

arXiv.org Artificial Intelligence, Jan-27-2017

exploration, neural network, upstream oil & gas, (17 more...)

arXiv.org Artificial Intelligence

1605.09674

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas > Upstream (0.36)

VIME: Variational Information Maximizing Exploration

Houthooft, Rein, Chen, Xi, Duan, Yan, Schulman, John, De Turck, Filip, Abbeel, Pieter

Neural Information Processing Systems, Dec-31-2016

artificial intelligence, exploration, upstream oil & gas, (15 more...)

Industry: Energy > Oil & Gas > Upstream (0.36)

Technology: